Important data structures

[!whitebg] Data available from anywhere in a mounting session

[!whitebg] Arbitrary hierarchy definition and directory data structure


Caching

  • Construct the entire idToDataOnIdData dictionary from all data in lTSI
  • Construct the filter lists and connect their invalidation functions; if a filter list belongs to a tip, also set its update function
  • Input the ids into the root directory, causing the id filtration through the blue line on the right. When a tip is hit, the blue process pauses and the yellow begins. The yellow process applies additional organisation and also constructs the dict[ namePlusExts( unique to the directory), id] at index 1 of the directory's runtime variable ( a sketch of all three steps follows below)
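A rough sketch of these three steps; every name here ( the Directory shape, filterIds, namePlusExts, the runtime variable layout) is an assumption standing in for the real classes, not a confirmed API.

# Sketch: the three caching steps with hypothetical stand-in names
def buildCache( lTSI, rootDirectory):
	# 1. Build idToDataOnIdData from everything in lTSI
	idToDataOnIdData= { dataId: { "namePlusExts": namePlusExts( data)}
		for dataId, data in lTSI.items()}
	# 2. Wire each filter list to its invalidation function; tips also get
	# an update function so map changes propagate
	for directory in walk( rootDirectory):
		directory.filterList.onInvalidate= directory.invalidate
		if not directory.children:  # a tip
			directory.filterList.onUpdate= directory.updateMap
	# 3. Push every id into the root: filtration ( blue) runs down the tree;
	# at each tip the yellow pass builds dict[ namePlusExts, id] at index 1
	# of that directory's runtime variable
	rootDirectory.filterIds( list( idToDataOnIdData))
	return idToDataOnIdData

def walk( directory):
	yield directory
	for child in directory.children:
		yield from walk( child)

def namePlusExts( data):
	# Assumed helper: a data item's name joined with its extensions
	return data[ "name"]+ "".join( data.get( "exts", []))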

[!whitebg] AHDMountingFiltrationCachingProcess


Old structure and new byte saving and retrieval mechanism

[!whitebg] Old mounting function and data communication organisation

So if current places want byte data associated with a communicationId, they call getBytes. If there isn't currently any byte data associated with that communicationId, getBytes calls getBytesFromStore in order to load it in from lTSI.

I want saveBytes to check whether the data is versioned and do a delayed save if so, whilst also saving the bytes to the latestBytes dict for getBytesFromStore to read from ( a sketch of this retrieval path follows below).

getBytes -> getBytesWithCommunicationId saveBytes
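A minimal sketch of that retrieval path, assuming latestBytes is the in-memory dict mentioned above and lTSI behaves like a plain mapping; none of these are the real signatures.

# Sketch: in-memory-first retrieval, falling back to the store
lTSI: dict= dict()         # stand-in for the long-term store
latestBytes: dict= dict()  # communicationId -> most recent bytes in memory

def getBytesFromStore( communicationId):
	# latestBytes is read first so pending ( not yet saved) data stays visible
	if communicationId in latestBytes:
		return latestBytes[ communicationId]
	return lTSI[ communicationId]

def getBytes( communicationId):
	if communicationId not in latestBytes:
		latestBytes[ communicationId]= getBytesFromStore( communicationId)
	return latestBytes[ communicationId]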

So the problem is that successive operations often render delta changes ineffective; to make them more effective, one must group those changes in time and execute them after the group has finished.

An issue currently unsolved here: queries made during that group, before the save, need to receive the updated data even if it hasn't been pushed to the store yet.

So the getBytesFromStore and saveBytes functions are always manipulating the bytes; whether the data is in an FSFileRep or not, they take the bytes and return the bytes.

This means that we might be able to do the grouping in them. If saveBytes is called and the data is versioned, we should queue the byte data and only save it after the timeout. getBytesFromStore should read this byte data whenever called, if it is present.

This byte data needs to be accessible to the tracking thread. Without directly sharing memory, we can use the pipes, queues and managers in the multiprocessing library.

This can be done with only one extra thread; each versioned piece of data can store its pending bytes in its own piece of memory.

The tracking thread can loop, starting from the first recorded version grouping; if the difference between the current time and the attempted save time is larger than the timeout, it takes the bytes and saves them.

In summary, for the main thread: when it encounters a request to save byte data, if the data is not versioned it carries out the save immediately. If it is versioned then:

  • If the tracker is not running on its loop, it is started, and the time of the attempted save is recorded along with the requested byte data.
  • If the tracker is running, we update the time of the attempted save as well as the byte data.

This requires a mechanism for the main thread to update the specific entry. Alternatively, the main thread can just dump all of the requests into a pipe, and the secondary thread can pop out all of the requests; if there are multiple for one piece of data, it can use the most recent one. It can then re-add any necessary entries that it has taken out. A sketch of this approach follows below.

I think the ltsd data will need to be passed back and forth in order to save, unless the other thread consistently handles both saving and loading.
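A minimal single-process sketch of this grouping, assuming the tracker is a thread ( so plain threading and queue.Queue stand in for the multiprocessing pipes/managers mentioned above). DelayedSaver, SAVE_DELAY, saveToStore and getPendingBytes are all assumed names, not the real API; CPython's GIL is what makes the shared pending dict tolerable here.

# Sketch: debounced, grouped saves for versioned byte data
import queue, threading, time

SAVE_DELAY= 2.0  # assumed grouping window, in seconds

class DelayedSaver:
	def __init__( self, saveToStore):
		self.saveToStore= saveToStore  # callable( communicationId, byteData)
		self.requests= queue.Queue()   # main thread -> tracker thread
		self.pending= dict()           # communicationId -> ( attemptTime, byteData)
		self.tracker= None

	def saveBytes( self, communicationId, byteData, versioned):
		if not versioned:
			self.saveToStore( communicationId, byteData)  # unversioned: save immediately
			return
		# Versioned: record/ update the attempted save time and the bytes
		self.requests.put( ( communicationId, time.monotonic(), byteData))
		if self.tracker is None:  # start the tracker on first use; it never exits
			self.tracker= threading.Thread( target= self._loop, daemon= True)
			self.tracker.start()

	def getPendingBytes( self, communicationId):
		# getBytesFromStore can consult this so queries made during the group
		# see the updated bytes before they reach the store
		entry= self.pending.get( communicationId)
		return None if entry is None else entry[ 1]

	def _loop( self):
		while True:
			# Drain all queued requests; later entries for the same id
			# overwrite earlier ones, which is the coalescing step
			try:
				while True:
					communicationId, attemptTime, byteData= self.requests.get_nowait()
					self.pending[ communicationId]= ( attemptTime, byteData)
			except queue.Empty:
				pass
			# Flush every entry whose grouping window has elapsed
			now= time.monotonic()
			for communicationId in list( self.pending):
				attemptTime, byteData= self.pending[ communicationId]
				if now- attemptTime>= SAVE_DELAY:
					del self.pending[ communicationId]
					self.saveToStore( communicationId, byteData)
			time.sleep( 0.05)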

[!whitebg] New mounting byte saving and retrieval organisation


Runtime directory structure, filter list, map updating

Another thing to solve here is how to do successive alterations to the cached ids in the filter lists, as well as ...

So: list the alterations that are made to the cached representation of the store, assign them to symbols for quick reference, and then explain what must be done upon each alteration to produce the expected behaviour.

| id | fuse initiator | description |
| --- | --- | --- |
| q | create | addition of a new id to the store which should be filtered, within current rules, to be at a specified path |
| w | mkdir | add a directory into the aHD with a specified name at the specified path |
| e | rmdir | remove a directory from the aHD along with all recursively contained directories. Remove all recursively held data from the store; could either remove from the store or remove from the root filter ( depends on what the root filter is; may need a special inclusion tag if this is desired) |
| r | unlink | this should either remove the specified data from the store or remove it from the root filter |
| t | rename | rename of a "file" in place |
| y | rename | rename of a directory in place |
| u | rename | movement of a "file" |
| i | rename | movement of a directory |
| o | rename | rename and movement of a "file" |
| p | rename | rename and movement of a directory |

q: we need to be able to be given an aHD with filters, additional filtration rules, and a directory within it, and produce data which will be filtered to that directory. That would require an additional definition alongside the filters describing how to produce data that will pass them, or some way of describing the filtration and the creation of filtration-matching data simultaneously. Both of those solutions sound like too much work currently for the return, so we can just create data that matches the has-tags setup as well as the additional filtration rules ( sketched below). This piece of data's presence in other directories may change as a result, so it should be incrementally run through the aHD filter list and map updates. If a file can't be filtered to be in the specified dir then an error can be raised: NOTUNIQ, PERMISSIONERROR, ADDRNOTAVAIL.
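A small sketch of that cheap option: derive the tags the new data must carry from the directory path, skipping directories with custom filters. The ( name, hasCustomFilter) pair shape is an assumption.

# Sketch: produce tags that will pass the plain has-tag filters on a path
def tagsForPath( directoriesOnPath):
	# directoriesOnPath: [ ( name, hasCustomFilter), ...] from root to target
	return [ name for name, hasCustomFilter in directoriesOnPath if not hasCustomFilter]

# e.g. tagsForPath( [ ( "projects", False), ( "renders", True)]) -> [ "projects"];
# a custom-filtered directory contributes no tag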

w: we need to place that directory within the aHD; if no additional filtration options are enabled then we just need to construct the runtime information and the map.

  • If tips only is enabled, the newly created directory will not have any child directories, so these don't need to be considered when creating it; however, should the new directory filter to any ids, all containing directories' maps will need to be updated

  • If exclusive is enabled then the new directory's map will be affected by the prior directories in the yellow order, and the directories after the new one will be affected by it. This means the exclusivity set should be reconstructed from the prior directories and then, if the new directory filters to anything, the directories after the new one should have their maps updated in the yellow order. Really they just need the new directory's filtered ids, minus existing exclusions, removed from their maps

  • If both tips only and exclusive are enabled, then if the new directory filters to any ids, the prior exclusivity in the yellow order should be calculated and used within the new directory's additional filtration. Then all new additions to the exclusivity set should be removed from the maps of all directories following the new directory in the yellow order. This includes non-tip exclusions, as all containing directories are after the new directory in the yellow order.

    this can be done by creating the runtime data for the new directory and then forcing map recreation for ( see the sketch after this list):

    • none, if no extra filtration
    • those above and including the new directory, if tips only
    • all, if exclusive, regardless of tips-only status
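A compact sketch of those three cases as a pure function; the argument shapes are assumptions.

# Sketch: which directories must rebuild their maps after a mkdir ( w)
def mapsToRecreate( newDir, containingDirs, yellowOrder, tipsOnly, exclusive):
	if exclusive:                  # exclusive forces a rebuild of every
		return list( yellowOrder)  # directory, regardless of tips-only status
	if tipsOnly:                   # only the new directory and those containing it
		return containingDirs+ [ newDir]
	return []                      # no extra filtration: nothing to rebuild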

e: to prevent the envisioned unlink operation from removing data from directories that are already being removed, we can amass a list of all data contained in the removed directory, remove it and any contained directories from the aHD, deal with the repercussions, and then remove the contained data. As for the repercussions of removing a directory: none if there is no additional filtration. If tips only, then all data that would have been excluded from containing directories will be removed anyway. If exclusive, then there are no data ids that would have been added to the exclusive set that won't be removed anyway. So there are no repercussions that won't get cleaned up by removing the data, and no directories after it in the yellow order will want the data, as it is removed. One would expect, in current filesystems, for the contained data to be removed; if the data is also contained in other places, what is to be done? Those pieces can either be ignored or removed; we'll just remove them for now. Errors: not a dir.

r: let's only envision deleting data from the store for now. So when a piece of data is unlinked, it should be deleted from the store, removed from any maps it is present in, and removed from the idToDataOnIdData dictionary. A list of referencing directories on the idToDataOnIdData dictionary could be useful here.

t: we need to recalculate the data's storage method within the store: what it can be stored as, and whether the fsfilewrapper is necessary. We can use the function in HierarchyAndMounting. The namePlusExts also needs to be updated on idToDataOnIdData and on each of the maps which reference it; it may be useful to maintain a list of referencing directories within idToDataOnIdData.

y: If the directory has custom filters then nothing needs to be done apart from changing the name. If the directory has no custom filters then all ids filtered underneath it should have their tag matching the directory name changed to the new name. Now, what happens to other directories that contain those renamed files? They may have been doing other things with that tag information in their filters, so they will need refiltering. When searching for the directories that will need to be updated, if we are not distinguishing between those with custom filters and those without, then we must conclude that the root directory is the place at which refiltering should occur, as we are not sure whether this new tag will be matched against the root directory. If we are distinguishing, then for each ... One possible solution to avoid a complete recalculation would be to implement incremental changes to the operating ids worked upon by a filter list. This would require updates to the filtration mechanism so that existing caches can be used in the new calculation and then updated.

  • If tips only is enabled then any containing directory of the refiltering directory will need to reevaluate its ... Is this worth speculating about now that it seems we must do complete recalculations? Well, we are only doing a complete recalculation if we aren't regarding the custom-filter presence of directories containing the updated ids; in that case we use the root, causing an update to all of the filter list results, which prompts a full map update.

The action taken here generalises to the action that must be taken upon any bulk update of tags.

u: this faces the same challenge as q: we need to be able to place an id at a specific location. So, for the specified directory path, we can add the tags of all directories with no filter. As for additional filtration rules, these may exclude the file from actually ending up where it was desired to be; this is fine. Given the new tags, other directories may need to hold this id in their maps; incrementally filtering it through the entire aHD seems like a way to achieve this. Depending on ...

i: When moving a directory we must take all of the directories above the moved directory's initial position which are not using custom filters and amass their names as ⌿. We then take all of the directories above the moved directory's desired position which are not using custom filters and amass their names as ⤘. On all the data recursively contained within the moved directory, tags matching ⌿ should be removed and ⤘ tags should be added. The directory is then moved within the aHD and the filter list is relinked. The changed data needs to be incrementally run through the aHD. Then, to completely ( not incrementally) update the additional filtration rules of the following: if exclusivity is active then z must be run, then the moved directory's map must be updated along with those of all its contained directories; if only tips only is active then x must be run; if exclusivity is active then c must be run.

This isn't precise and is hard to think through.

o: the same action as t must be taken initially, where the data's storage method is reevaluated and applied. Then we must change the namePlusExts in idToDataOnIdData, remove the id from any referencing maps, readjust the tags as in n, and then incrementally filter the data through the aHD.

p: ...

Commonalities thought to be needed:

  • z: need to calculate excluded data up to an arbitrary directory following the yellow path
  • x: need to run additional filtration rules and reconstruct the map on all containing directories of a given directory, from the bottom up
  • c: need to run additional filtration rules and reconstruct the map on all directories following a given directory in the yellow order
  • v: the need for incrementally adjusting the operating ids of a filter list
  • b: support for incremental updating
  • n: replacing the tags of a list of data with the names of all directories without custom filters in a given directory path, then running an incremental filter through the aHD for them
  • m: takes a list of ids, removes them from all directory maps referenced in l, removes their idToDataOnIdData entries, and removes them from the store
  • l: storage, on idToDataOnIdData, of a list of each directory which references a specific id ( m and l are sketched below)
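A sketch of m built on l, assuming each idToDataOnIdData entry carries a set of referencing directories and each directory's map sits at index 1 of its runtime variable; store.delete is an assumed method.

# Sketch: commonality m, using the reverse references stored per l
def removeIds( ids, idToDataOnIdData, store):
	for dataId in ids:
		onIdData= idToDataOnIdData.pop( dataId, None)  # drop the on-id entry
		if onIdData is None:
			continue
		for directory in onIdData[ "referencingDirectories"]:  # l
			directoryMap= directory.runtime[ 1]  # dict[ namePlusExts, id]
			for name in [ n for n, mappedId in directoryMap.items() if mappedId== dataId]:
				del directoryMap[ name]
		store.delete( dataId)  # finally, remove from the store itself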

Succinct rewrite referencing the symbolised commonalities:

Does it affect filter list filtration, and if so, incrementally or wholly? All filter list filtration affects additional filtration. If it doesn't affect filter list filtration, then does it affect additional filtration?

q: create data with the desired name, using the HierarchyAndMounting function to determine the saved format. Then run n with only the new data and the desired path.

w: create the directory with the specified name at the specified path in the aHD, with no custom filter. Then construct the filter list, filter the ids, and link the filter list. If exclusivity is enabled then run z up to the new directory, then update the mapping. If only tips only is enabled then run x. If exclusivity is enabled then run c.

e: amass a unique list of ids under the directory proposed for removal, remove that directory from the aHD and from l for each contained id, and then run m over all amassed ids.

r: run m on the specified id.

t: run getPythonInstanceFromBytes from HierarchyAndMounting with CHECK_IF_NEEDED, using the new filename. Update namePlusExts on each directory map which references the id in l, and in idToDataOnIdData.

y: rename the directory. If the directory is using custom filters then we are done; otherwise, for all child ids which have a tag matching the old directory name, replace that tag with the new name ( sketched below). Then run an incremental refresh of the entire aHD with those changed ids ( b).
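A sketch of that tag swap; childIdToTags maps each contained id to its tag list ( an assumed shape), and the returned ids feed the incremental refresh ( b).

# Sketch: replace the old directory-name tag on all child ids ( y)
def renameDirectoryTags( oldName, newName, childIdToTags):
	changed= []
	for dataId, tags in childIdToTags.items():
		if oldName in tags:
			tags[ tags.index( oldName)]= newName
			changed.append( dataId)
	return changed  # run these ids through the incremental aHD refresh ( b)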

u: run n on the data.

i: ...

o: ...

p: ...


Caching speed improvement

Another thing to solve is the speed of caching. This fits in with the need to do bulk operations upon a store. If a bulk operation retrieves data, it can return something which can grab the results incrementally; this incremental object can also have an option to dump everything into memory at once ( sketched below). The stores can optionally implement my filters and can throw an error if a filter isn't supported.

Or, if a filter isn't supported, they can do the default: pull the data into memory and use the python filter implementation. There should then be an option to either throw an error on unsupported filters or fall back to that default.
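A sketch of the incremental result object; the store's bulkGet and its shape are assumptions.

# Sketch: incremental results with an opt-in dump to memory
class IncrementalResults:
	def __init__( self, iterator):
		self._iterator= iterator
	def __iter__( self):  # grab the results incrementally
		return self._iterator
	def dumpAll( self)-> list:  # or pull everything into memory at once
		return list( self._iterator)

# e.g. results= IncrementalResults( store.bulkGet( ids)): iterate lazily,
# or call results.dumpAll() when everything is needed up front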

# Example mongodb implementation
def filterHandling( self, filters, errorOnUnsupported= False)-> dict:
	filterQuery= dict()
	for filter in filters:
		if type( filter) in self.supportedFilters:
			currentFilterQuery: dict= self.supportedFilters[ type( filter)]( filter)
			# Might not be as easy as updating the filter dict with each conversion
			filterQuery.update( currentFilterQuery)
		elif errorOnUnsupported:
			raise NotImplementedError( f"Unsupported filter: { type( filter).__name__}")

	# Do we use the query to find ids here, do we take some sort of argument to say what aspects of the data to return here, do we just return the filterQuery as queryData
	# It should become more clear as we go on
	return filterQuery


found, filterHandling= getDataForType( store, "handleFilters")
if found:
	queryData= filterHandling( store, filters)
else:
	# Default path: pull into memory and use the python filter implementation
	...